Public Library of Science (PLOS)

Texas ScholarWorks

FigShare

Tensor Decomposition Reveals Concurrent Evolutionary Convergences and Divergences and Correlations with Structural Motifs in Ribosomal RNA

Author: Andrew M. Gross
BT Wimberly
Chaitanya Muralidhara
CP Ridley
CR Vossbrinck
CR Woese
EA Doherty
EW Sayers
F Schluenzen
FH Crick
G Lentzen
GH Golub
J Cadima
J Isaksson
JH Cate
JI Sagara
JJ Cannone
JM Ogle
L Lancaster
L Omberg
LE Orgel
N Ban
O Alter
Orly Alter
P Nissen
Purificación López-García
Robin R. Gutell
RP Hirt
RR Gutell
S Tavazoie
S Winker
SL Baldauf
SR Eddy
WJ Bock
Publication venue: Public Library of Science
Publication date: 29/04/2011

Evolutionary relationships among organisms are commonly described by using a hierarchy derived from comparisons of ribosomal RNA (rRNA) sequences. We propose that even on the level of a single rRNA molecule, an organism's evolution is composed of multiple pathways due to concurrent forces that act independently upon different rRNA degrees of freedom. Relationships among organisms are then compositions of coexisting pathway-dependent similarities and dissimilarities, which cannot be described by a single hierarchy. We computationally test this hypothesis in comparative analyses of 16S and 23S rRNA sequence alignments by using a tensor decomposition, i.e., a framework for modeling composite data. Each alignment is encoded in a cuboid, i.e., a third-order tensor, where nucleotides, positions and organisms, each represent a degree of freedom. A tensor mode-1 higher-order singular value decomposition (HOSVD) is formulated such that it separates each cuboid into combinations of patterns of nucleotide frequency variation across organisms and positions, i.e., “eigenpositions” and corresponding nucleotide-specific segments of “eigenorganisms,” respectively, independent of a-priori knowledge of the taxonomic groups or rRNA structures. We find, in support of our hypothesis that, first, the significant eigenpositions reveal multiple similarities and dissimilarities among the taxonomic groups. Second, the corresponding eigenorganisms identify insertions or deletions of nucleotides exclusively conserved within the corresponding groups, that map out entire substructures and are enriched in adenosines, unpaired in the rRNA secondary structure, that participate in tertiary structure interactions. This demonstrates that structural motifs involved in rRNA folding and function are evolutionary degrees of freedom. 
Third, two previously unknown coexisting subgenic relationships between Microsporidia and Archaea are revealed in both the 16S and 23S rRNA alignments, a convergence and a divergence, conferred by insertions and deletions of these motifs, which cannot be described by a single hierarchy. This shows that mode-1 HOSVD modeling of rRNA alignments might be used to computationally predict evolutionary mechanisms.
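
The encoding and decomposition described in this abstract can be illustrated with a minimal sketch. It assumes a one-hot encoding of each aligned nucleotide, an unfolding of the cuboid along the organisms axis, and an ordinary SVD of the unfolded matrix; the function names `encode_alignment` and `unfold_and_decompose` are hypothetical, and the authors' actual mode-1 HOSVD formulation may differ in normalization and other details.

```python
import numpy as np

NUCLEOTIDES = "ACGU"  # assumed alphabet; gaps and other symbols are left all-zero


def encode_alignment(seqs):
    """One-hot encode equal-length aligned sequences into an
    organisms x positions x nucleotides tensor (a cuboid)."""
    n_org, n_pos = len(seqs), len(seqs[0])
    tensor = np.zeros((n_org, n_pos, len(NUCLEOTIDES)))
    for i, seq in enumerate(seqs):
        for j, ch in enumerate(seq):
            k = NUCLEOTIDES.find(ch)
            if k >= 0:
                tensor[i, j, k] = 1.0
    return tensor


def unfold_and_decompose(tensor):
    """Unfold the cuboid along the organisms axis and take its SVD.

    Columns of `u` are patterns across organisms (read here as the
    "eigenpositions"); each slice of `v` is a pattern across positions
    with one segment per nucleotide (read here as an "eigenorganism").
    """
    n_org, n_pos, n_nuc = tensor.shape
    unfolded = tensor.reshape(n_org, n_pos * n_nuc)
    u, s, vt = np.linalg.svd(unfolded, full_matrices=False)
    fractions = s**2 / np.sum(s**2)  # share of the signal captured per pattern
    return u, s, vt.reshape(-1, n_pos, n_nuc), fractions


if __name__ == "__main__":
    toy = ["ACGU", "ACGA", "AC-U"]  # made-up three-organism alignment
    u, s, v, fractions = unfold_and_decompose(encode_alignment(toy))
    print(fractions)
```

Ranking the patterns by `fractions` is one simple way to decide which of them to treat as significant; the toy alignment is invented for illustration only.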

Texas ScholarWorks

Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints

Author: AV Uzilov
B Gulko
B Knudsen
B Morgenstern
D Sankoff
DH Mathews
DKY Chiu
DS Fields
E Rivas
G Storz
I Holmes
IL Hofacker
J Gorodkin
J Reeder
J Wuyts
JE Hopcroft
JE Tabaska
JH Havgaard
M Zuker
MS Waterman
NR Pace
O Perriquet
PP Gardner
R Durbin
R Giegerich
R Green
R Lück
R Nussinov
RD Dowell
Robin D Dowell
RR Gutell
S Batzoglou
S Griffiths-Jones
Sean R Eddy
SR Eddy
SV Muse
V Juan
VR Akmaev
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding, and development of practical heuristics for dealing with the computational complexity of the algorithm. RESULTS: We use probabilistic models (pair stochastic context-free grammars, pairSCFGs) as a unifying framework for scoring pairwise alignment and folding. A constrained version of the pairSCFG structural alignment algorithm was developed which assumes knowledge of a few confidently aligned positions (pins). These pins are selected based on the posterior probabilities of a probabilistic pairwise sequence alignment. CONCLUSION: Pairwise RNA structural alignment improves on structure prediction accuracy relative to single sequence folding. Constraining on alignment is a straightforward method of reducing the runtime and memory requirements of the algorithm. Five practical implementations of the pairwise Sankoff algorithm – this work (Consan), David Mathews' Dynalign, Ian Holmes' Stemloc, Ivo Hofacker's PMcomp, and Jan Gorodkin's FOLDALIGN – have comparable overall performance with different strengths and weaknesses.

Springer - Publisher Connector

Public Library of Science (PLOS)

Digital Commons@Becker

Mutational Patterns in RNA Secondary Structure Evolution Examined in Three RNA Families

Author: Anuj Srivastava
AT Dandjinou
B Knudsen
C Lemieux
C Zwieb
CR Woese
David Liberles
DR Maddison
E Rivas
ES Haas
F Ronquist
FJ Sun
GE Fox
H Ly
I Holmes
IL Hofacker
J Felsenstein
J Lingner
JA Hartigan
Jan Mrázek
JC Ellis
JD Podlevsky
JJ Cannone
JL Chen
JL Thorne
JP Huelsenbeck
JR Cole
JS McCaskill
JW Brown
KP Williams
Liming Cai
LJ Collins
M Mccormickgraham
M Zuker
MT Dixon
NJ Savill
NM Krishnan
P Gueneau de Novoa
PD Williams
R Malmberg
RJ Klein
RK Bradley
RR Gutell
Russell L. Malmberg
S Griffiths-Jones
S Smit
WP Maddison
Publication venue: Public Library of Science
Publication date: 01/01/2011

The goal of this work was to study mutational patterns in the evolution of RNA secondary structure. We analyzed bacterial tmRNA, RNaseP and eukaryotic telomerase RNA secondary structures, mapping structural variability onto phylogenetic trees constructed primarily from rRNA sequences. We found that secondary structures evolve both by whole stem insertion/deletion, and by mutations that create or disrupt stem base pairing. We analyzed the evolution of stem lengths and constructed substitution matrices describing the changes responsible for the variation in the RNA stem length. In addition, we used principal component analysis of the stem length data to determine the most variable stems in different families of RNA. These data provide new insights into the evolution of RNA secondary structures and patterns of variation in the lengths of double helical regions of RNA molecules. Our findings will facilitate design of improved mutational models for RNA structure evolution.

CiteSeerX

The Fragmented Mitochondrial Ribosomal RNAs of Plasmodium falciparum

Author: A Barta
A Ben-Shem
A Kairo
AB Vaidya
BF Lang
Bob Lightowlers
Bryan H. Sands
BT Wimberly
CA Milbury
CA Raabe
CH Slamovits
CJ Jackson
D Tu
DA Joy
DE Gillespie
DF Spencer
DH Rehkopf
DJ Conway
DJ Klein
DT Dubin
E Evguenieva-Hackenberg
F Schluenzen
GC Shukla
Germaine Tami
HJ Bernstein
J Brosius
J Cox-Singh
J Krungkrai
J Rabl
JA Mears
Jamie J. Cannone
JE Feagin
Jean E. Feagin
JJ Cannone
JL Hansen
JT Joseph
Jung C. Lee
K Hikosaka
K Suplick
Kevin J. Coe
KJ Dechering
M Fry
M Payne
M Turmel
Maria Isabel Harrell
MJ Gardner
MM Yusupov
MN Schnare
MR Sharma
Murray N. Schnare
MV Rodnina
MW Gray
N Ban
NR Voss
P Boer
P Chandramouli
P Chomczynski
P Nissen
P Sloof
PB Moore
R Kamikawa
R Okimoto
R Staden
RA Van Etten
RJ Wilson
Robin R. Gutell
RQ Lin
RR Gutell
S Jongwutiwes
S Krief
SL Perkins
T Powers
TA Steitz
TA Tatusova
TC White
TYK Heinonen
VF de la Cruz
W Liu
W Trager
Y Ji
Publication venue: Public Library of Science
Publication date: 22/06/2012

The mitochondrial genome in the human malaria parasite Plasmodium falciparum is most unusual. Over half the genome is composed of the genes for three classic mitochondrial proteins: cytochrome oxidase subunits I and III and apocytochrome b. The remainder encodes numerous small RNAs, ranging in size from 23 to 190 nt. Previous analysis revealed that some of these transcripts have significant sequence identity with highly conserved regions of large and small subunit rRNAs, and can form the expected secondary structures. However, these rRNA fragments are not encoded in linear order; instead, they are intermixed with one another and the protein coding genes, and are coded on both strands of the genome. This unorthodox arrangement hindered the identification of transcripts corresponding to other regions of rRNA that are highly conserved and/or are known to participate directly in protein synthesis. The identification of 14 additional small mitochondrial transcripts from P. falciparum and the assignment of 27 small RNAs (12 SSU RNAs totaling 804 nt, 15 LSU RNAs totaling 1233 nt) to specific regions of rRNA are supported by multiple lines of evidence. The regions now represented are highly similar to those of the small but contiguous mitochondrial rRNAs of Caenorhabditis elegans. The P. falciparum rRNA fragments cluster on the interfaces of the two ribosomal subunits in the three-dimensional structure of the ribosome. All of the rRNA fragments are now presumed to have been identified with experimental methods, and nearly all of these have been mapped onto the SSU and LSU rRNAs. Conversely, all regions of the rRNAs that are known to be directly associated with protein synthesis have been identified in the P. falciparum mitochondrial genome and RNA transcripts. The fragmentation of the rRNA in the P. falciparum mitochondrion is the most extreme example of any rRNA fragmentation discovered.

FigShare

Predicting RNA secondary structure by the comparative approach: how to select the homologous sequences

Author: A Lescoute
AM Rosenblad
C Papanicolaou
C Woese
C Zwieb
D Chiu
D Gautheret
D Mathews
D Matthews
D Sankoff
E Bindewald
F Rousset
F Tahi
Fariza Tahi
I Hofacker
J Brown
K Han
K Horimoto
L Vawter
M Szymanski
M Zuker
N Savill
O Perriquet
P Baldi
P Doty
P Higgs
PP Gardner
R Nussinov
RJ Klein
RR Gutell
S Freier
S Lindgreen
Stéfan Engelen
WC Curtis
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: The secondary structure of an RNA must be known before the relationship between its structure and function can be determined. One way to predict the secondary structure of an RNA is to identify covarying residues that maintain the pairings (Watson-Crick, wobble and non-canonical pairings). This "comparative approach" consists of identifying mutations from homologous sequence alignments. The sequences must covary enough for compensatory mutations to be revealed, but comparison is difficult if they are too different. Thus the choice of homologous sequences is critical. While many possible combinations of homologous sequences may be used for prediction, only a few will give good structure predictions. This can be due to poor quality alignment in stems or to the variability of certain sequences. This problem of sequence selection is currently unsolved. RESULTS: This paper describes an algorithm, SSCA, which measures the suitability of sequences for the comparative approach. It is based on evolutionary models with structure constraints, particularly those on sequence variations and stem alignment. We propose three models, based on different constraints on sequence alignments. We show the results of the SSCA algorithm for predicting the secondary structure of several RNAs. SSCA enabled us to choose sets of homologous sequences that gave better predictions than arbitrarily chosen sets of homologous sequences. CONCLUSION: SSCA is an algorithm for selecting combinations of RNA homologous sequences suitable for secondary structure predictions with the comparative approach.
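
The idea of identifying covarying residues in an alignment can be made concrete with a small sketch. It uses mutual information between two alignment columns, which is one common covariation score and is an assumption here; the abstract does not say that SSCA uses this measure. The function name `mutual_information` and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Sequences with a gap ("-") in either column are skipped.
    """
    pairs = [(seq[i], seq[j]) for seq in alignment
             if seq[i] != "-" and seq[j] != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Made-up alignment: columns 0 and 3 change together so that they
    # could stay paired; columns 1 and 2 never vary.
    toy = ["GAAC", "CAAG", "AAAU", "UAAA"]
    print(mutual_information(toy, 0, 3))
    print(mutual_information(toy, 1, 2))
```

A score of this kind also shows why sequence selection matters: columns that never vary carry no covariation signal, while sequences that are too divergent are hard to align in the first place.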

HAL Evry

Springer - Publisher Connector

arXiv.org e-Print Archive

Massively Parallel RNA Chemical Mapping with a Reduced Bias MAP-seq Protocol

Author: A Fire
AM Zhelkovsky
B Langmead
C Ehresmann
C Guerrier-Takada
C Torchia
CW Greider
EE Regulski
EJ Merino
H Li
JB Lucks
JG Underwood
JJ Fritz
JN Zadeh
JP Bida
K Miura
M Kertesz
N Ban
NR Pace
P Cordero
P Rocca-Serra
P Sripakdeevong
RR Gutell
S Aviran
S Yoon
SA Mortimer
SC Flores
T Blondal
T Inoue
TR Cech
TW Li
W Kladwang
Z Wang
Publication date: 03/04/2013

Chemical mapping methods probe RNA structure by revealing and leveraging correlations of a nucleotide's structural accessibility or flexibility with its reactivity to various chemical probes. Pioneering work by Lucks and colleagues has expanded this method to probe hundreds of molecules at once on an Illumina sequencing platform, obviating the use of slab gels or capillary electrophoresis on one molecule at a time. Here, we describe optimizations to this method from our lab, resulting in the MAP-seq protocol (Multiplexed Accessibility Probing read out through sequencing), version 1.0. The protocol permits the quantitative probing of thousands of RNAs at once, by several chemical modification reagents, on the time scale of a day using a table-top Illumina machine. This method and a software package MAPseeker (http://simtk.org/home/map_seeker) address several potential sources of bias, by eliminating PCR steps, improving ligation efficiencies of ssDNA adapters, and avoiding problematic heuristics in prior algorithms. We hope that the step-by-step description of MAP-seq 1.0 will help other RNA mapping laboratories to transition from electrophoretic to next-generation sequencing methods and to further reduce the turnaround time and any remaining biases of the protocol. Comment: 22 pages, 5 figures.

Novel high-rank phylogenetic lineages within a sulfur spring (Zodletone Spring, Oklahoma), revealed using a combined pyrosequencing-Sanger approach

Author: Ahn J
Andersson AF
Bibby K
Brown MV
Callbeck C
Dalevi D
Donachie SP
Dumbrell AJ
Engel AS
Gilbert JA
Giongo A
Gutell RR
Halm H
He J
Heijs SK
Hoff KJ
Hollister EB
Huse SM
Jones RT
Kataoka T
Kim Y-S
Kirchman DL
Liu FH
Liu Z
Monchy S
Noller HF
Posada D
Schloss PD
Schütte UME
Senko JM
Stern S
Tringe SG
Uroz S
Webster NS
Wimberly BT
Woese Gutell RR
Ye L
Youssef NH
Publication venue: American Society for Microbiology
Publication date: 01/04/2012

The utilization of high-throughput sequencing technologies in 16S rRNA gene-based diversity surveys has indicated that within most ecosystems, a significant fraction of the community could not be assigned to known microbial phyla. Accurate determination of the phylogenetic affiliation of such sequences is difficult due to the short-read-length output of currently available high-throughput technologies. This fraction could harbor multiple novel phylogenetic lineages that have so far escaped detection. Here we describe our efforts in accurate assessment of the novelty and phylogenetic affiliation of selected unclassified lineages within a pyrosequencing data set generated from source sediments of Zodletone Spring, a sulfide- and sulfur-rich spring in southwestern Oklahoma. Lineage-specific forward primers were designed for 78 putatively novel lineages identified within the pyrosequencing data set, and representative nearly full-length small-subunit (SSU) rRNA gene sequences were obtained by pairing those primers with reverse universal bacterial primers. Of the 78 lineages tested, amplifiable products were obtained for 52, 32 of which had at least one nearly full-length sequence that was representative of the lineage targeted. Analysis of phylogenetic affiliation of the obtained Sanger sequences identified 5 novel candidate phyla and 10 novel candidate classes (within Fibrobacteres, Planctomycetes, and candidate phyla BRC1, GN12, TM6, TM7, LD1, WS2, and GN06) in the data set, in addition to multiple novel orders and families. The discovery of multiple novel phyla within a pilot study of a single ecosystem clearly shows the potential of the approach in identifying novel diversities within the rare biosphere. Peer reviewed.

SHAREOK repository

Incorporating phylogenetic-based covarying mutations into RNAalifold for RNA consensus structure prediction

Author: A Esquela-Kerscher
AO Harmanci
B Gulko
B Knudsen
C Workman
CB Do
CM Croce
Consortium The ENCODE Project
CR Woese
D Sankoff
DKY Chiu
DL Swofford
E Rivas
F Xia
IL Hofacker
J Felsenstein
JA Jaeger
JH Havgaard
JP Huelsenbeck
JS Mattick
JS Pedersen
L He
M Mandal
M Zuker
MA Larkin
MS Nicoloso
MS Waterman
Ping Ge
PP Gardner
R Lorenz
R Nussinov
RD Dowell
RJ Klein
RR Gutell
RR Sokal
S Washietl
S Will
SE Seemann
SH Bernhart
Shaojie Zhang
SR Eddy
The FANTOM Consortium
TR Mercer
WM Fitch
Y Sakakibara
Z Yao
Publication venue: Springer Science and Business Media LLC